(19) Europäisches Patentamt
European Patent Office
Office européen des brevets



(12) 



(45) Date of publication and mention 
of the grant of the patent: 
08.05.2002 Bulletin 2002/19 




(11) EP 0 911 808 B1

EUROPEAN PATENT SPECIFICATION 

(51) Int. Cl.: G10L 15/26, H04L 12/28



(21) Application number: 97118470.0 

(22) Date of filing: 23.10.1997 



(54) Speech interface in a home network environment 

Sprachschnittstelle für ein Hausnetzwerk
Interface vocale pour un réseau local domestique






(84) Designated Contracting States: 
DE FR GB 

(43) Date of publication of application: 
28.04.1999 Bulletin 1999/17 

(73) Proprietor: Sony International (Europe) GmbH 
10785 Berlin (DE) 

(72) Inventors:
• Buchner, Peter,
  c/o Sony International (Europe) GmbH
  70736 Fellbach (DE)
• Goronzy, Silke,
  c/o Sony International (Europe) GmbH
  70736 Fellbach (DE)
• Kompe, Ralf,
  c/o Sony International (Europe) GmbH
  70736 Fellbach (DE)
• Rapp, Stefan,
  c/o Sony International (Europe) GmbH
  70736 Fellbach (DE)



(74) Representative: Müller, Frithjof E., Dipl.-Ing.
Müller Hoffmann & Partner
Patentanwälte
Innere Wiener Strasse 17
81667 München (DE)



(56) References cited:
EP-A- 0 747 881
WO-A-96/21990
DE-U- 29 618 130

• EVANS G: "SOLVING HOME AUTOMATION PROBLEMS USING ARTIFICIAL
  INTELLIGENCE TECHNIQUES" IEEE TRANSACTIONS ON CONSUMER
  ELECTRONICS, vol. 37, no. 3, 1 August 1991, pages 395-400,
  XP000263213
• "DATA RETRIEVAL THROUGH A COMPACT DISK DEVICE HAVING A
  SPEECH-DRIVEN INTERFACE" IBM TECHNICAL DISCLOSURE BULLETIN,
  vol. 38, no. 1, January 1995, page 267/268 XP000498766



Note: Within nine months from the publication of the mention of the
grant of the European patent, any person may give notice to the
European Patent Office of opposition to the European patent granted.
Notice of opposition shall be filed in a written reasoned statement.
It shall not be deemed to have been filed until the opposition fee
has been paid. (Art. 99(1) European Patent Convention).






Description 

[0001] This invention relates to a speech interface in 
a home network environment. In particular, it is con- 
cerned with a speech recognition device, a remotely 
controllable device and a method of self-initialization of 
a speech recognition device. 

[0002] Generally, speech recognizers are known for 
controlling different consumer devices, i.e. television, 
radio, car navigation, mobile telephone, camcorder, PC, 
printer, heating of buildings or rooms. Each of these 
speech recognizers is built into a specific device to con- 
trol it. The properties of such a recognizer, such as the 
vocabulary, the grammar and the corresponding com- 
mands, are designed for this particular task. 
[0003] On the other hand, technology is now available
to connect several of the above-listed consumer devices
via a home network with dedicated bus systems, e.g.
an IEEE 1394 bus. Devices adapted for such systems
communicate by sending commands and data to each 
other. Usually such devices identify themselves when 
they are connected to the network and get a unique ad- 
dress assigned by a network controller. Thereafter, 
these addresses can be used by all devices to commu- 
nicate with each other. All other devices already con- 
nected to such a network are informed about address 
and type of a newly connected device. Such a network 
will be included in private homes as well as cars. 
[0004] Speech recognition devices enhance comfort 
and, if used in a car may improve security, as the oper- 
ation of consumer devices becomes more and more 
complicated, e.g. controlling of a car stereo. Also in a 
home network environment e.g. the programming of a 
video recorder or the selection of television channels 
can be simplified when using a speech recognizer. On 
the other hand, speech recognition devices have a rather
complicated structure and need quite expensive
technology if reliable and flexible operation is to
be ensured. Therefore, a speech recognizer will not be
affordable for most of the devices listed above.
[0005] Therefore, it is the object of the present inven- 
tion to provide a generic speech recognizer facilitating 
the control of several devices. Further, it is the object of 
the present invention to provide a remotely controllable 
device that simplifies its network-controllability via 
speech. 

[0006] A further object is to provide a method of self- 
initialization of the task dependent parts of such a 
speech recognition device to control such remotely con- 
trollable devices. 

[0007] These objects are respectively achieved as 
defined in the independent claims 1, 4, 14, 15 and 18.
[0008] Further preferred embodiments of the inven- 
tion are defined in the respective subclaims. 
[0009] The present invention will become apparent 
and its numerous modifications and advantages will be 
better understood from the following detailed description
of an embodiment of the invention taken in conjunction
with the accompanying drawings, wherein

Fig. 1 shows a block diagram of an example of a
speech unit according to an embodiment of
the invention;

Fig. 2 shows a block diagram of an example of a
network device according to an embodiment
of the invention;

Fig. 3 shows an example of a wired 1394 network
having a speech unit and several 1394 devices;

Fig. 4 shows an example of a wired 1394 network
having a speech unit incorporated in a 1394
device and several normal 1394 devices;

Fig. 5 shows three examples of different types of
networks;

Fig. 6 shows an example of a home network in a
house having three clusters;

Fig. 7 shows two examples of controlling a network
device remotely via a speech recognizer;

Fig. 8 shows an example of a part of a grammar for
a user dialogue during a VCR programming;

Fig. 9 shows an example of a protocol of the
interaction between a user, a speech recognizer
and a network device;

Fig. 10 shows an example of a learning procedure of
a connected device, where the name of the
device is determined automatically;

Fig. 11 shows an example of a protocol of a
notification procedure of a device being newly
connected, where the user is asked for the name
of the device;

Fig. 12 shows an example of a protocol of the
interaction of multiple devices for vocabulary
extensions concerning media contents; and

Fig. 13 shows another example of a protocol of the
interaction of multiple devices for vocabulary
extensions concerning media contents.

[0010] Fig. 1 shows a block diagram of an example of
the structure of a speech unit 2 according to the invention.
Said speech unit 2 is connected to a microphone 1 and a
loudspeaker, which could also be built into said speech
unit 2. The speech unit 2 comprises a speech synthesizer,
a dialogue module, a speech recognizer and a speech
interpreter and is connected to an IEEE 1394 bus system 10.
It is also possible that the microphone 1 and/or the
loudspeaker are connected to the speech unit 2 via said bus
system 10. Of course, it is then necessary that the
microphone 1 and/or the loudspeaker are respectively
equipped with circuitry to communicate with said speech
unit 2 via said network, such as A/D and D/A converters
and/or command interpreters, so that the microphone 1 can
transmit the electric signals corresponding to received
spoken utterances to the speech unit 2 and the loudspeaker
can output received electric signals from the speech unit 2
as sound.
[0011] IEEE 1394 is an international standard, low- 
cost digital interface that will integrate entertainment, 
communication and computing electronics into consum- 
er multimedia. It is a low-cost easy-to-use bus system, 
since it allows the user to remove or add any kind of 
1394 devices with the bus being active. Although the
present invention is described in connection with such 
an IEEE 1394 bus system and IEEE 1394 network de- 
vices, the proposed technology is independent from the 
specific IEEE 1394 standard and is well-suited for all 
kinds of wired or wireless home networks or other net- 
works. 

[0012] As will be shown in detail later, a speech unit 
2, as shown in Fig. 1 is connected to the home network 
10. This is a general purpose speech recognizer and 
synthesizer having a generic vocabulary. The same 
speech unit 2 is used for controlling all of the devices 11 
connected to the network 10. The speech unit 2 picks 
up a spoken-command from a user via the microphone 
1, recognizes it and converts it into a corresponding 
home network control code, henceforth called user-net- 
work-command, e.g. specified by the IEEE 1394 stand- 
ard. This control code is then sent to the appropriate 
device that performs the action associated with the user- 
network-command. 

[0013] To be capable of enabling all connected net- 
work devices to be controlled by speech, the speech unit 
has to "know" the commands that are needed to provide 
operability of all individual devices 11. Initially, the 
speech unit "knows" a basic set of commands, e.g., 
commands that are the same for various devices. There 
can be a many-to-one mapping between spoken-com- 
mands from a user and user-network-commands gen- 
erated therefrom. Such spoken-commands can e.g. be 
play, search for radio station YXZ or (sequences of) 
numbers such as phone numbers. These commands 
can be spoken in isolation or they can be explicitly or
implicitly embedded within full sentences. Full sentences
will henceforth also be called spoken-commands.
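
The many-to-one mapping described above can be illustrated with a minimal Python sketch. The phrases, the code names such as `CMD_PLAY` and the function `interpret` are assumptions made for illustration only; the specification does not define them, and real control codes would come from the network standard in use. Simple substring matching stands in for a real recognizer and grammar.

```python
from typing import Optional

# Hypothetical table: several spoken-commands map to one user-network-command.
SPOKEN_TO_NETWORK_COMMAND = {
    "play": "CMD_PLAY",
    "start playback": "CMD_PLAY",
    "louder": "CMD_VOLUME_UP",
    "turn it up": "CMD_VOLUME_UP",
    "switch off": "CMD_POWER_OFF",
}


def interpret(utterance: str) -> Optional[str]:
    """Return the user-network-command for an utterance, or None."""
    text = utterance.lower().strip()
    # Case 1: the command is spoken in isolation.
    if text in SPOKEN_TO_NETWORK_COMMAND:
        return SPOKEN_TO_NETWORK_COMMAND[text]
    # Case 2: the command is embedded in a full sentence.
    # Longer phrases are tried first so that the most specific one wins.
    for phrase in sorted(SPOKEN_TO_NETWORK_COMMAND, key=len, reverse=True):
        if phrase in text:
            return SPOKEN_TO_NETWORK_COMMAND[phrase]
    return None
```

For instance, `interpret("please turn it up a bit")` and `interpret("louder")` both yield the same hypothetical code.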

[0014] In general, speech recognizers and technolo- 
gies for speech recognition, interpretation, and dia- 
logues are well-known and will not be explained in detail 
in connection with this invention. Basically, a speech 
recognizer comprises a set of vocabulary and a set of 
knowledge-bases (henceforth grammars) according to 
which a spoken-command from a user is converted into
a user-network-command that can be carried out by a
device. The speech recognizer also may use a set of 
alternative pronunciations associated with each vocab- 
ulary word. The dialogue with the user will be conducted 
according to some dialogue model. 
[0015] The speech unit 2 according to an embodiment 
of the invention comprises a digital signal processor 3 
connected to the microphone 1. The digital signal processor
3 receives the electric signals corresponding to
the spoken-command from the microphone 1 and per- 
forms a first processing to convert these electric signals 
into digital words recognizable by a central processing 
unit 4. To be able to perform this first processing, the 
digital signal processor 3 is bidirectionally coupled to a 
memory 8 holding information about the process to be 
carried out by the digital signal processor 3 and a 
speech recognition section 3a included therein. Further, 
the digital signal processor 3 is connected to a feature 
extraction section 7e of a memory 7 wherein information 
is stored of how to convert electric signals correspond- 
ing to spoken-commands into digital words correspond- 
ing thereto. In other words, the digital signal processor 
3 converts the spoken-command from a user input via 
the microphone 1 into a computer recognizable form,
e.g. text code.

[0016] The digital signal processor 3 sends the gen- 
erated digital words to the central processing unit 4. The 
central processing unit 4 converts these digital words 
into user-network-commands sent to the home network 
system 10. Therefore, the digital signal processor 3 and 
the central processing unit 4 can be seen as speech rec- 
ognizer, dialogue module and speech interpreter. 
[0017] It is also possible that the digital signal proces- 
sor 3 only performs a spectrum analysis of the spoken- 
command from a user input via the microphone 1 and 
the word recognition itself is conducted in the central 
processing unit 4 together with the conversion into
user-network-commands. Depending on the capacity of the
central processing unit 4, it can also perform the spec- 
trum analysis and the digital signal processor 3 can be 
omitted. 

[0018] Further, the central processing unit 4 provides
a learning function for the speech unit 2 so that the 
speech unit 2 can learn new vocabulary, grammar and 
user-network-commands to be sent to a network device 
11 corresponding thereto. To be able to perform these 
tasks the central processing unit 4 is bidirectionally cou- 
pled to the memory 8 that is also holding information 
about the processes to be performed by the central 
processing unit 4. Further, the central processing unit 4 
is bidirectionally coupled to an initial vocabulary section 
7a, an extended vocabulary section 7b, an initial gram- 
mar section 7c, an extended grammar section 7d and a 
software section 7f that comprises a recognition section 
and a grapheme-phoneme conversion section of the 
memory 7. Further, the central processing unit 4 is bidi- 
rectionally coupled to the home network system 10 and 
can also send messages to a digital signal processor 9
included in the speech unit 2 comprising a speech
generation section 9a that serves to synthesize messages
into speech and outputs this speech to a loudspeaker. 
[0019] The central processing unit 4 is bidirectionally 
coupled to the home network 10 via a link layer control
unit 5 and an I/F physical layer unit 6. These units serve
to filter out network-commands from bus 10 directed to 
the speech unit 2 and to address network-commands to 
selected devices connected to the network 10. 
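
The two roles of these units, filtering incoming network-commands and addressing outgoing ones, can be sketched as follows. This is only an illustration under assumed names: `NetworkCommand`, `LinkLayer` and the integer addresses are hypothetical and do not reflect the actual IEEE 1394 packet format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkCommand:
    destination: int  # hypothetical numeric bus address
    payload: str      # hypothetical command code


class LinkLayer:
    """Toy stand-in for the link layer and physical layer units."""

    def __init__(self, own_address: int):
        self.own_address = own_address

    def accept(self, command: NetworkCommand) -> bool:
        # Filter: keep only commands on the bus directed to this unit.
        return command.destination == self.own_address

    def address(self, destination: int, payload: str) -> NetworkCommand:
        # Attach the address of the selected device to an outgoing command.
        return NetworkCommand(destination, payload)
```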
[0020] Therefore, it is also possible that new user-net- 
work-commands together with corresponding vocabu- 
lary and grammars can be learned by the speech unit 2 
directly from other network devices. To perform such a 
learning, the speech unit 2 can send control commands 
stored in the memory 8 to control the network devices, 
henceforth called control-network-commands, to re- 
quest their user-network-commands and corresponding 
vocabulary and grammars according to which they can 
be controlled by a user. The memory 7 comprises an 
extended vocabulary section 7b and an extended gram- 
mar section 7d to store newly input vocabulary or gram- 
mars. These sections are respectively designed like the 
initial vocabulary section 7a and the initial grammar sec- 
tion 7c, but newly input user-network-commands to- 
gether with information needed to identify these user- 
network-commands can be stored in the extended vo- 
cabulary section 7b and the extended grammar section 
7d by the central processing unit 4. In this way, the 
speech unit 2 can learn user-network-commands and 
corresponding vocabulary and grammars built into an 
arbitrary network device. New network devices then need
no built-in speech recognition device, but only the
user-network-commands and corresponding vocabulary and
grammars through which they should be controllable
via a speech recognition system. Further, there
has to be a facility to transfer these data to the speech 
unit 2. The speech unit 2 according to the invention 
learns said user-network-commands and correspond- 
ing vocabulary and grammar and the respective device 
can be voice-controlled via the speech unit 2. 
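
A minimal sketch of this learning step, assuming a much simplified memory layout: the class `SpeechUnitMemory`, its field names and the command codes are hypothetical, and grammars are omitted. The initial vocabulary stands in for section 7a and the extended vocabulary for section 7b.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechUnitMemory:
    # Basic commands shared by various devices (stand-in for section 7a).
    initial_vocabulary: dict = field(default_factory=lambda: {
        "switch on": "CMD_POWER_ON",
        "switch off": "CMD_POWER_OFF",
    })
    # Commands learned from devices (stand-in for section 7b).
    extended_vocabulary: dict = field(default_factory=dict)

    def learn(self, device_name, device_vocabulary):
        """Store device-specific phrases received from a network device."""
        for phrase, command in device_vocabulary.items():
            self.extended_vocabulary[(device_name, phrase)] = command

    def lookup(self, device_name, phrase):
        """Prefer a learned device-specific command, else a basic one."""
        learned = self.extended_vocabulary.get((device_name, phrase))
        if learned is not None:
            return learned
        return self.initial_vocabulary.get(phrase)
```

After `learn("vcr", {"record": "CMD_RECORD"})`, the phrase "record" resolves for the VCR only, while "switch on" still resolves for any device through the initial vocabulary.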
[0021] The initial vocabulary section 7a and the initial
grammar section 7c store a basic set of user-network-commands
that can be used for various devices, like
user-network-commands corresponding to the spoken-commands
switch on, switch off, pause, louder, etc. These
user-network-commands are stored in connection with
vocabulary and grammars needed by the central processing
unit 4 to identify them out of the digital words produced
by the speech recognition section via the digital signal
processor 3. Further, questions or messages are stored in a
memory. These can be output from the speech unit 2 to a
user. Such questions or messages may be used in a dialogue
between the speech unit 2 and the user to complete commands
spoken by the user into proper user-network-commands;
examples are please repeat, which device, do you really
want to switch off?, etc. All such messages or questions
are stored together with speech data needed by the central
processing unit 4 to generate digital words to be output
to the speech generation and synthesis section 9a of
the digital signal processor 9 to generate spoken
utterances output to the user via the loudspeaker. Through
the microphone 1, the digital signal processors 3 and 9
and the loudspeaker, a "bidirectional coupling" of the
central processing unit 4 with a user is possible.
Therefore, it is possible that the speech unit 2 can
communicate with a user and learn from him or her. As in
the case of the communication with a network device 11,
the speech unit 2 can access a set of
control-network-commands stored in the memory 8 to instruct
the user to give certain information to the speech unit 2.
[0022] As stated above, user-network-commands and the
corresponding vocabulary and grammars can also be input by
a user via the microphone 1 and the digital signal
processor 3 to the central processing unit 4 on demand of
control-network-commands output as messages by the speech
unit 2 to the user. After the user has uttered a
spoken-command to set the speech unit 2 into a learning
state with him or her, the central processing unit 4
performs a dialogue with the user on the basis of
control-network-commands stored in the memory 8 to
generate new user-network-commands and corresponding
vocabulary to be stored in the respective sections of the
memory 7.

[0023] It is also possible that the process of learning
new user-network-commands is done partly automatically
by the communication between the speech unit 2 and an
arbitrary network device and partly dialogue-controlled
between the speech unit 2 and a user. In this way,
user-dependent user-network-commands for selected
network devices can be generated.
[0024] As stated above, the speech unit 2 processes
three kinds of commands: spoken-commands uttered by a
user; user-network-commands, i.e. digital signals
corresponding to the spoken-commands; and
control-network-commands, used to communicate with other
devices or with a user in order to learn new
user-network-commands from other devices 11 and to assign
certain functionalities thereto, so that a user can input
new spoken-commands, or to assign a new functionality
to user-network-commands already included.
[0025] Outputs of the speech unit directed to the user
are either synthesized speech or pre-recorded utterances.
A mixture of both might be useful, e.g. pre-recorded
utterances for the most frequent messages and synthesized
speech for other messages. Any network device can send
messages to the speech unit. These messages are either
directly in orthographic form or they encode or identify
in some way an orthographic message. Then these
orthographic messages are output via a loudspeaker, e.g.
included in the speech unit 2. Messages can contain all
kinds of information usually presented on a display of a
consumer device. Furthermore, there can be questions put
forward to the user in the course of a dialogue. As stated
above, such a dialogue can also be produced by the speech
unit 2 itself to verify or confirm spoken-commands or it
can be generated by the speech unit 2 according to
control-network-commands to learn new user-network-commands
and corresponding vocabulary and grammars.

[0026] The speech input and/or output facility, i.e. the
microphone 1 and the loudspeaker, can also be one or 
more separate device(s). In this case messages can be 
communicated in orthographic form in-between the 
speech unit and the respective speech input and/or out- 
put facility. 

[0027] A spoken message sent from the speech unit 2
itself to the user, like which device should be switched
on?, can also be met by the user with a question put back
to the speech unit 2, e.g. which network device do you
know?. The speech unit 2 could first answer this question
via speech, before the user answers the initial spoken
message sent from the speech unit.
[0028] Fig. 2 shows a block diagram of an example of
the structure of remotely controllable devices according
to an embodiment of this invention, here a network device
11. This block diagram shows only those function blocks
necessary for the speech controllability. A central
processing unit 12 of such a network device 11 is connected
via a link layer control unit 17 and an I/F physical
layer unit 16 to the home network bus 10. As in the
speech unit 2, the connection between the central
processing unit 12 and the home network bus 10 is
bidirectional so that the central processing unit 12 can
receive user-network-commands and control-network-commands
and other information data from the bus 10 and send
control-network-commands, messages and other information
data to other network devices or a speech unit 2 via the
bus 10. Depending on the device, it might also be possible
that it will also send user-network-commands. The central
processing unit 12 is bidirectionally coupled to a memory
14 where all information necessary for the processing of
the central processing unit 12, including a list of
control-network-commands needed to communicate with other
network devices, is stored. Further, the central processing
unit 12 is bidirectionally coupled to a device control unit
15 controlling the overall processing of the network device
11. A memory 13 holding all user-network-commands to
control the network device 11 and the corresponding
vocabulary and grammars is also bidirectionally coupled to
the central processing unit 12. These user-network-commands
and corresponding vocabularies and grammars stored in the
memory 13 can be downloaded into the extended vocabulary
section 7b and the extended grammar section 7d of the
memory 7 included in the speech unit 2 in connection with a
device name for a respective network device 11 via the
central processing unit 12 of the network device 11, the
link layer control unit 17 and the I/F physical layer unit
16 of the network device 11, the home network bus system
10, the I/F physical layer unit 6 and the link layer
control unit 5 of the speech unit 2 and the central
processing unit 4 of the speech unit 2. In this way all
user-network-commands necessary to control a network device
11 and corresponding vocabulary and grammars are learned by
the speech unit 2 according to the present invention.
Therefore, a network device according to the present
invention needs no built-in device dependent speech
recognizer to be controllable via speech, but just a memory
holding all device dependent user-network-commands with
associated vocabulary and grammars to be downloaded into
the speech unit 2. It is to be understood that a basic
control of a network device by the speech unit 2 is also
given without vocabulary update information, i.e. the basic
control of a network device without its device dependent
user-network-commands with associated vocabulary and
grammars is possible. Basic control means here to have the
possibility to give commands generally defined in some
standard, like switch-on, switch-off, louder, switch
channel, play, stop, etc.
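
The device side of this download can be sketched as follows. The class `NetworkDevice`, the control code `REQUEST_VOCABULARY` and the reply layout are assumptions made for illustration; the specification does not name a concrete request or message format.

```python
class NetworkDevice:
    """Toy network device that holds its own speech control data."""

    def __init__(self, name, commands):
        self.name = name
        # Stand-in for memory 13: phrase -> user-network-command.
        self._commands = dict(commands)

    def handle_control_command(self, control_command):
        """Answer a control-network-command received from the bus."""
        if control_command == "REQUEST_VOCABULARY":  # hypothetical code
            # Reply with the device name and a copy of the stored data.
            return {"device_name": self.name,
                    "commands": dict(self._commands)}
        return None  # unknown control-network-commands are ignored here
```

A speech unit could send the hypothetical request once after a device is announced on the bus and store the reply under the returned device name.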

[0029] Fig. 3 shows an example of a network archi- 
tecture having an IEEE 1394 bus and connected thereto 
one speech unit 2 with microphone 1 and loudspeaker 
and four network devices 11.

[0030] Fig. 4 shows another example of a network
architecture having four network devices 11 connected to
an IEEE 1394 bus. Further, a network device 41 having
a built-in speech unit with microphone 1 and loudspeaker
is connected to the bus 31. Such a network device
41 with a built-in speech unit has the same functionality
as a network device 11 and a speech unit 2. Here, the
speech unit controls the network devices 11 and the
network device 41 in which it is built.
[0031] Fig. 5 shows three further examples of network
architectures. Network A is a network similar to that
shown in Fig. 3, but six network devices 11 are connected
to the bus 31. With regard to the speech unit 2 that is
also connected to the bus 31, there is no limitation on
the network devices 11 controllable via said speech unit 2.
Every device connected to the bus 31 that is controllable
via said bus 31 can also be controlled via the speech
unit 2.

[0032] Network B shows a different type of network. 
Here, five network devices 11 and one speech unit 2 are 
connected to a bus system 51. The bus system 51 is
organized so that a connection is only necessary between
two devices. Network devices not directly connected
to each other can communicate via other third
network devices. Regarding the functionality, network B 
has no restrictions in comparison to network A. 
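
How a command might reach a device that is not directly connected can be sketched with a breadth-first search over point-to-point links. The search strategy, the function `find_route` and the device names are assumptions for illustration; the specification only states that communication via third devices is possible, not how a route is chosen.

```python
from collections import deque


def find_route(links, source, target):
    """Return a shortest chain of devices from source to target, or None.

    links maps each device to the devices it is directly connected to.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in links.get(node, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the target cannot be reached over the given links
```

With links `speech_unit - tv - vcr`, a command from the speech unit to the VCR would be relayed by the television.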
[0033] The third network shown in Fig. 5 is a wireless
network. Here, all devices can directly communicate
with each other via a transmitter and a receiver built into
each device. This example also shows that several
speech units 2 can be connected to one network. Those
speech units 2 can all have the same functionality or
different functionalities, as desired. In this way, it is
also easily possible to build personalized speech units
2 that can be carried by respective users and that can
control different network devices 11, as desired by the
user. Of course, personalized speech units can also be
used in a wired network. In comparison to a wireless
speech input and/or output facility, a personalized
speech unit has the advantage that it can automatically
log into another network, where all personalized features
are available.
[0034] Such a personalized network device can be
constructed to translate only those spoken-commands
of a selected user into user-network-commands using
speaker adaptation or speaker verification. This enables
a very secure access policy in that access is only
allowed if the correct speaker uses the correct speech
unit. Of course, all kinds of access can be controlled
in this way, e.g. access to the network itself or access
to devices connected to the network, like access to rooms,
to a VCR, to televisions and the like.

[0035] Further, electronic phone books may be stored
within the speech unit. Calling functions by name, e.g.
office, is strongly user-dependent and therefore such
features will preferably be realized in personalized
speech units. Also, spoken-commands such as switch on my
TV can easily be assigned to the correct
user-network-commands controlling the correct device, as it
may be the case that different users assign different
logical names to a device and the speech unit 2 has to
generate the same user-network-command when interpreting
different spoken-commands. On the other hand, it is
possible that the network comprises more than one device
of the same type, e.g. two TVs, and the speech unit
2 has to generate different user-network-commands
when interpreting the same spoken-command uttered
by different users, e.g. switch on my TV.
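
Both cases, different names for one device and one name for different devices, can be sketched with a per-user alias table. The user names, aliases, addresses and the function `resolve` are hypothetical, and the sketch assumes speaker identification has already taken place.

```python
# Hypothetical per-user logical names mapped to device addresses.
DEVICE_ALIASES = {
    ("anna", "my tv"): 0x0A,
    ("ben", "my tv"): 0x0B,       # same phrase, different device
    ("ben", "the telly"): 0x0B,   # different phrase, same device
}

PREFIX = "switch on "


def resolve(user, utterance):
    """Return (user-network-command, address) or None if unknown."""
    text = utterance.lower().strip()
    if not text.startswith(PREFIX):
        return None
    alias = text[len(PREFIX):]
    address = DEVICE_ALIASES.get((user, alias))
    if address is None:
        return None
    return ("CMD_POWER_ON", address)
```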
[0036] One speech unit can contain personalized
information of one user or of different users. In most
cases the personalized speech unit corresponding to only
one user will be portable and wireless, so that the user
can take it with him/her and has the same speech unit at
home, in the car or even in other networks, like in
his/her office.

[0037] The personalized speech unit can be used for 
speaker verification purposes. It verifies the words of a
speaker and allows the control of selected devices. This 
can also be used for controlling access to rooms, cars 
or devices, such as phones. 

[0038] A personalized speech unit can contain a 
speech recognizer adapted to one person, which strongly
enhances the recognition performance.
[0039] Fig. 6 shows an example of a home network
consisting of three clusters. One of the clusters is formed
by an IEEE 1394 bus 61 installed in a kitchen of the
house. Connected to this bus are a broadcast receiver 65,
a digital television 64, a printer 63, a phone 62 and a
long distance repeater 66. This cluster also has
connections to a broadcast gateway 60 to the outside of the
house and, via the repeater 66 and an IEEE 1394 bridge
74, to the cluster "sitting room", in which an IEEE
1394 bus 67 is also present. Apart from the bridge 74, a
speech unit 70, a personal computer 69, a phone 68, a
VCR 71, a camcorder 72 and a digital television 73 are
connected to the bus 67. The bridge 74 is also connected
to the third cluster "study", which comprises an IEEE
1394 bus 78 connected to the bridge 74 via a long
distance repeater 75. Further, a PC 76, a phone 77, a hard
disc 79, a printer 80, a digital television 81 and a
telephone NIU 82 are connected to said bus 78. A telephone
gateway 83 is connected to the telephone NIU 82.
[0040] The above described network is constructed
so that every device can communicate with the other
devices via the IEEE 1394 system, the bridge 74 and
the repeaters 66 and 75. The speech unit 70 located in
the sitting room can communicate with all devices and
can therefore control them. This speech unit 70 is built
like the speech unit 2 described above. Since in the
example shown in Fig. 6 several devices of the same type
are present, e.g. the digital television 73 in the sitting
room and the digital television 81 in the study, it is
possible to define user-defined device names. When the
network is set up, or when a device is connected to a
network that already has a device of this type connected
thereto, the speech unit 70 will ask the user for names
for these devices, e.g. television in the sitting room and
television in the study, to be assigned to the individual
devices. To be able to recognize these names, one of the
following procedures has to be carried out.

1. The user has to enter the orthographic form (sequence
of letters) of the device name by typing or
spelling. The speech unit 70 maps the orthographic
form into a phoneme or model sequence;

2. In the case of a personalized speech unit, the us- 
er utterance corresponding to the device name can 
be stored as a feature vector sequence, that is di- 
rectly used during recognition as reference pattern 
in a pattern matching approach; 

3. The phoneme sequence corresponding to the 
name can be learned automatically using a pho- 
neme recognizer. 

[0041] The user has then only to address these de- 
vices by name, e.g. television in the sitting room. The 
speech unit 70 maps the name to the appropriate net- 
work address. By default, the name corresponds to the 
functionality of the device. All commands uttered by a
user are sent to the device named last. Of course, it
is also possible that these names are changed later on.
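
The name handling described above can be sketched as a small directory that maps device names to network addresses and remembers the device named last. The class `DeviceDirectory`, its methods and the use of the reference numerals 73 and 81 as addresses are assumptions for illustration; substring matching stands in for real name recognition.

```python
class DeviceDirectory:
    """Toy mapping of spoken device names to network addresses."""

    def __init__(self):
        self._addresses = {}
        self._current = None  # address of the device named last

    def register(self, name, address):
        self._addresses[name.lower()] = address

    def rename(self, old_name, new_name):
        # Names can be changed later on; the address stays the same.
        self._addresses[new_name.lower()] = self._addresses.pop(old_name.lower())

    def route(self, utterance):
        """Return (address, remaining command text) or None."""
        text = utterance.lower()
        # Try longer names first so the most specific name matches.
        for name in sorted(self._addresses, key=len, reverse=True):
            if name in text:
                self._current = self._addresses[name]
                text = text.replace(name, "").strip()
                break
        if self._current is None:
            return None  # no device has been named yet
        return (self._current, text)
```

Once a device has been named, a later utterance such as "louder" is routed to that same device without repeating its name.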
[0042] In many situations a person might wish to ac- 
cess his device at home over the phone, e.g. to retrieve 
faxes or to control the heating remotely. Two alternative 
architectures to realize such a remote access are illus- 
trated in Fig. 7. 

[0043] Fig. 7a shows a speech unit 2 that is connected both to
the home network, which has a network device 11, and to the
public telephone network. A spoken-command from a user is
transmitted via the public telephone network to the speech unit
2, which translates the spoken-command into a user-network-command
to control the network device 11. A user can thus control his
home network from any place where he has access to the public
telephone network, independently of any device other than the
speech unit 2.

[0044] Fig. 7b shows another example in which a user having a
personalized speech unit 2 is within the reception area of an
arbitrary home network A. He utters a spoken-command into his
personalized speech unit 2, which translates said spoken-command
into a user-network-command and sends it to the home network A.
The home network A sends the generated user-network-command via
the transceivers 84 and the public telephone network to a home
network B in which the network device 11 is located, which is
then controlled by the translated spoken-command uttered by the
user. Of course, these features strongly depend on the available
networks.

[0045] As described above, the speech unit 2 has a speech output
facility, either included or connected thereto directly or via a
network, so that messages from network devices can be synthesized
into uttered sequences and a dialogue between the speech unit 2
and the user can be performed. Such a dialogue would, for example,
also be useful for the programming of a VCR. The messages can
also provide additional information about the status of the
network device, e.g. the titles of the CDs contained in a juke
box. In general, the number and type of messages is not fixed.
[0046] Fig. 8 shows examples of a part of a grammar for a user
dialogue during VCR programming. "S" denotes system questions;
"U" denotes spoken-commands or other user utterances. Possible
spoken-commands or user utterances at each dialogue step are
defined by the word grammar and the vocabularies.
[0047] Grammars, e.g. finite state transition grammars, are used
to restrict the set of word sequences to be recognized or to
specify a sequence of dialogue steps, e.g. those needed to
program a video recorder. A different finite state grammar may be
specified for each dialogue step. These grammars are directly
used by the speech unit. On the other hand, these grammars are
entirely device-dependent. Therefore, it is not practical to have
static finite state grammars in the speech unit. It is instead
proposed that a device newly connected to the network can send
its specific set of grammars to the speech unit.

[0048] As shown in the upper part of Fig. 8, a dialogue grammar
could be that in a step S1 the system asks the user for a
channel, i.e. outputs a message channel? to the user. In a
following step S2 the user inputs a word sequence U_CHANNEL to
the system as spoken-command. Thereafter, in step S3 the system
asks if the action to be programmed should be taken today. In the
following step S4 the user inputs a word sequence U_Y/N_DATE to
the system, telling yes, no or the date at which the action to be
programmed should take place. If the date corresponds to today or
the user answers the question of the system with yes, the system
asks in a following step S5 which movie. Thereafter, in a step S6
the user informs the system about the film with a word sequence
U_FILM. If the user has answered no in step S4, the system asks
for the date in step S7. In the following step S8 the user inputs
the date to the system as spoken-command in a word sequence
U_DATE; thereafter the process flow continues with step S5.
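The dialogue steps S1 to S8 can be sketched as a small control flow. This is an illustration under stated assumptions, not the grammar format of the specification: the function name, the `answer` callback and the prompt strings other than channel? are hypothetical, and it is assumed that a date given directly in step S4 leads on to step S5 in the same way as yes.

```python
from typing import Callable, Dict


def run_vcr_dialogue(answer: Callable[[str], str]) -> Dict[str, str]:
    """Walk steps S1..S8; answer(prompt) returns the recognized user reply."""
    result: Dict[str, str] = {}
    result["channel"] = answer("channel?")        # S1 (system) / S2 (U_CHANNEL)
    reply = answer("today?")                      # S3 (system) / S4 (U_Y/N_DATE)
    if reply == "yes":
        result["date"] = "today"
    elif reply == "no":
        result["date"] = answer("date?")          # S7 (system) / S8 (U_DATE)
    else:
        # Assumption: the user answered S3 directly with a date.
        result["date"] = reply
    result["film"] = answer("which movie?")       # S5 (system) / S6 (U_FILM)
    return result
```

A scripted `answer` function that returns canned replies is enough to exercise every branch of the flow.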
[0049] In the middle of Fig. 8 examples of the grammar for word
sequences corresponding to the above example are shown. In step
S4 the user can input yes, no or a date as a word sequence
U_Y/N_DATE. Therefore, as shown in the first line for the word
sequence U_Y/N_DATE, the user has the possibility to input a word
sequence U_Y/N in a step S41 or a word sequence U_DATE in a step
S42. In the second line for the word sequence U_Y/N the two
possibilities for the user input are shown, namely the word
sequence U_NO in a step S43 or the word sequence U_YES in a step
S44. The third line for the word sequence U_DATE shows the
possible word sequences for user inputs for a date; here a
sequence of two numbers is allowed as input, a first number
corresponding to a word sequence NO_1_31 in a step S45 and a
second number corresponding to a word sequence NO_1_12 in a step
S46.
[0050] The lower part of Fig. 8 shows vocabularies corresponding
to these word sequences. For example, the word sequence U_YES can
be represented by the words yes or yeh, the word sequence U_NO
can be represented by the vocabulary no, the word sequence
NO_1_31 can be represented by the vocabularies one, two, ...
thirty-one, first, second, ... thirty-first and the word sequence
NO_1_12 can be represented by the vocabularies one, ... twelve,
first, ... twelfth.
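The word grammar and vocabularies of Fig. 8 can be written down as two small tables and checked with a recursive matcher. The following is only a sketch: the dictionary layout and the function names are assumptions, and the vocabularies contain just the words spelled out in the text (the entries elided by "..." are deliberately left out).

```python
from typing import Dict, List, Sequence, Tuple

# Each non-terminal maps to a list of alternatives; an alternative is a
# sequence of symbols that must match in order.
GRAMMAR: Dict[str, List[List[str]]] = {
    "U_Y/N_DATE": [["U_Y/N"], ["U_DATE"]],      # steps S41, S42
    "U_Y/N": [["U_NO"], ["U_YES"]],             # steps S43, S44
    "U_DATE": [["NO_1_31", "NO_1_12"]],         # steps S45, S46
}

# Terminal word classes (abbreviated to the words named in the text).
VOCAB: Dict[str, List[str]] = {
    "U_YES": ["yes", "yeh"],
    "U_NO": ["no"],
    "NO_1_31": ["one", "two", "thirty-one", "first", "second", "thirty-first"],
    "NO_1_12": ["one", "twelve", "first", "twelfth"],
}


def accepts(symbol: str, words: Tuple[str, ...]) -> bool:
    """Return True if the word tuple can be derived from the symbol."""
    if symbol in VOCAB:
        return len(words) == 1 and words[0] in VOCAB[symbol]
    return any(_match_sequence(alt, words) for alt in GRAMMAR[symbol])


def _match_sequence(symbols: Sequence[str], words: Tuple[str, ...]) -> bool:
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    # Try every split that leaves at least one word per remaining symbol.
    last_split = len(words) - len(rest)
    return any(
        accepts(head, words[:i]) and _match_sequence(rest, words[i:])
        for i in range(1, last_split + 1)
    )
```

With these tables, `accepts("U_Y/N_DATE", ("first", "twelfth"))` follows the U_DATE alternative, while a single yes follows the U_Y/N alternative.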
[0051] Fig. 9 shows an example of the interaction between a
user, a speech unit and a network device.
[0052] First, the user utters the spoken-command 
play. In the shown case, the speech unit knows that 
more than one device connected to the network can be 
played. It determines that the spoken-command play 
does not comprise enough information to control a spe- 
cific network device. Therefore, it outputs the message 
which device should be played? to the user. The user answers
this message by telling the speech unit VCR. Now the speech unit
determines that the user has provided enough information to
control a specific network device as desired, here to set the
VCR into the play state.
Therefore, it transmits the corresponding user-network- 
command PLAY to the VCR address via the network. 
The VCR receives this user-network-command and 
tries to perform the associated action. In the shown case 
the VCR cannot detect a cassette, therefore it cannot 
be set into the play state and sends an error message 
to the device from which the user-network-command 
PLAY was received. In this case, an error ID X is sent 
to the device address of the speech unit. The speech 
unit receives this error message, recognizes it and outputs a
corresponding message sorry, there is no cassette in the VCR to
the user.
[0053] The speech unit acts as an interface between the network,
including all network devices connected thereto, and one or more
users. The users just have to utter spoken-commands to the speech
unit connected to the network when they are within its reception
area. The speech unit, which basically knows the state of the
network or can generate it, verifies whether complete
user-network-commands can be generated; otherwise it sends a
message to the user asking precisely for the missing part of a
message or spoken-command, so that it can properly generate the
corresponding user-network-command.
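The interaction of Fig. 9 (an incomplete command, a clarifying question, and an error ID translated into a spoken message) can be sketched as follows. The class name, the tuple-based return values, the address strings and the fallback messages are hypothetical; only the command PLAY, the question and the cassette message come from the description. The sketch assumes at least one playable device is known.

```python
from typing import Dict, Optional, Tuple


class SpeechUnitSketch:
    """Hypothetical model of the speech unit's completeness check."""

    def __init__(self, playable: Dict[str, str], error_texts: Dict[str, str]) -> None:
        self.playable = playable          # device name -> network address
        self.error_texts = error_texts    # error ID -> message for the user
        self.pending: Optional[str] = None

    def hear(self, utterance: str) -> Tuple[str, ...]:
        """Return ('ask', message) or ('send', address, user-network-command)."""
        if utterance == "play":
            if len(self.playable) == 1:
                (address,) = self.playable.values()
                return ("send", address, "PLAY")
            # Not enough information: remember the command and ask back.
            self.pending = "PLAY"
            return ("ask", "which device should be played?")
        if self.pending is not None and utterance in self.playable:
            command, self.pending = self.pending, None
            return ("send", self.playable[utterance], command)
        return ("ask", "sorry, I did not understand")  # hypothetical fallback

    def on_error(self, error_id: str) -> str:
        # Translate a device error ID into a message to be synthesized.
        return self.error_texts.get(error_id, "unknown error")
```

Here the speech unit only forwards a user-network-command once both the action and the target device are known, which mirrors the verification described in paragraph [0053].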

[0054] The speech unit has to keep track of devices connected to
the network to eventually retrieve and understand new
functionalities. Therefore the speech unit will check all
connected devices for new speech control functionality. An
exemplary process flow is shown in Fig. 10. It also has to keep
track of whether devices are disconnected from the network.

[0055] First, the speech unit sends a request for the ID,
including name and device type, to the device address of a
network device connected to the network. In this state, the
network device cannot be controlled by speech. After the device
has received the request for its ID from the speech unit, it
sends its ID to the address of the speech unit. Thereafter, the
speech unit sends a request for the user-network-command list of
the device to the corresponding device address. Having received
this request, the network device sends its user-network-command
list to the speech unit. The speech unit receives the
user-network-command list, updates its vocabulary with the
vocabulary and grammars received from the device and sends an
acknowledgement of receipt to the device address of the device.
The device can now be controlled by speech. Preferably, the
speech unit notifies the user after such a procedure that a new
device providing new speech control functionality is available.
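The message sequence of Fig. 10 can be sketched as three request/response exchanges. The request names (`GET_ID`, `GET_COMMAND_LIST`, `ACK`), the class names and the `speech_enabled` flag are assumptions made for illustration; the specification does not define a wire format here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Device:
    address: str
    name: str
    device_type: str
    command_list: Dict[str, str]      # spoken-command -> user-network-command
    speech_enabled: bool = False

    def handle(self, request: str) -> Optional[dict]:
        if request == "GET_ID":
            return {"name": self.name, "type": self.device_type}
        if request == "GET_COMMAND_LIST":
            return dict(self.command_list)
        if request == "ACK":
            # After the acknowledgement the device can be controlled by speech.
            self.speech_enabled = True
            return None
        raise ValueError(f"unknown request: {request}")


@dataclass
class HandshakeSpeechUnit:
    # (device name, spoken-command) -> (device address, user-network-command)
    vocabulary: Dict[Tuple[str, str], Tuple[str, str]] = field(default_factory=dict)

    def initialise(self, device: Device) -> str:
        ident = device.handle("GET_ID")                   # request ID, receive ID
        commands = device.handle("GET_COMMAND_LIST")      # request and receive list
        for spoken, net_cmd in commands.items():          # update vocabulary
            self.vocabulary[(ident["name"], spoken)] = (device.address, net_cmd)
        device.handle("ACK")                              # acknowledge receipt
        return f"new device available: {ident['name']}"   # notification for the user
```

The returned string stands in for the optional notification that a new speech-controllable device is available.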
[0056] If a new device is connected to the network, it is
possible that it broadcasts its ID, comprising network address,
name and device type. Fig. 11 shows an example of such an
initialization. Here it is shown that the device offering the new
speech control functionality gives some kind of notification to
the speech unit; then, after sending the user-network-command
list request and receiving the user-network-command list, the
speech unit asks a user to give a logical name for the newly
connected device. The user then types or spells the name of the
newly connected device so that the speech unit can receive it. Of
course, it is also possible that the user just utters the new
name. The logical name given by the user is received by the
speech unit, which updates the vocabulary and grammars and sends
a confirmation of reception to the IEEE 1394 device that has been
newly connected to the network. This device can now be controlled
by speech.
[0057] The command list sent from the device to the speech unit
can either consist of only the orthographic form of the
spoken-commands in conjunction with the appropriate
user-network-command, or it additionally provides the
pronunciation, e.g. phonemic transcriptions, for these
spoken-commands. The speech unit's present vocabulary is then
extended with these new user-network-commands. In case the
user-network-command list only gave the orthography of the
spoken-commands but not the transcriptions, a built-in
grapheme-to-phoneme conversion section 7f generates the
pronunciations and their variations and thus completes the
user-network-command list. After updating the vocabulary and
grammars the new device can be fully controlled by speech.
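Completing a command list that lacks transcriptions can be sketched as below. The entry layout (`spoken`, `command`, `pronunciations`) and both function names are hypothetical, and `toy_g2p` is a placeholder that merely spells the word out; it is not a real grapheme-to-phoneme model and does not represent conversion section 7f.

```python
from typing import Callable, Dict, List


def complete_command_list(
    entries: List[Dict[str, object]],
    g2p: Callable[[str], List[str]],
) -> List[Dict[str, object]]:
    """Fill in pronunciations for entries that only carry orthography."""
    completed = []
    for entry in entries:
        # Keep transcriptions supplied by the device; otherwise generate them.
        pronunciations = entry.get("pronunciations") or g2p(str(entry["spoken"]))
        completed.append({**entry, "pronunciations": list(pronunciations)})
    return completed


def toy_g2p(word: str) -> List[str]:
    # Placeholder only: one pseudo-phoneme per letter.
    return [" ".join(word.lower())]
```

A real conversion section would return several pronunciation variants per word; the list-valued return type leaves room for that.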
[0058] If such a handshake procedure between a newly connected
device and the speech unit is not performed, only a basic
functionality of the device is provided, namely by those
user-network-commands stored in the initial vocabulary of the
speech unit that match the user-network-commands of said device.
It is also possible that user-network-commands used for other
devices can be adapted to the new device, but full
controllability by speech cannot be guaranteed in this way. The
speech unit still has to know the ID of said device to access it,
so some kind of communication between the speech unit and the
device, or another device knowing the ID, has to take place.
[0059] Commands that include media descriptions, e.g. the name
of a CD, song titles, movie titles, or station names, induce
vocabularies that are in part unknown to the speech unit. Hence,
this information has to be acquired from other sources. The
current state of the art is that the user enters this information
by typing or spelling. The speech unit according to the
invention, on the other hand, can dynamically create the
vocabulary and/or grammars in a manner similar to the processes
described above. The name and/or pronunciation of a media
description or program name is acquired in one of the following
ways:

from a database delivered by someone on some medium, e.g.
CD-ROM;

the medium, e.g. CD, DVB, itself holds its description and
optionally also the pronunciation of its description, e.g.
artists' names and song titles are included in machine-readable
form on a CD;

from a database accessed over an information transport
mechanism, e.g. the internet, DAB, a home network, telephone
lines.

Besides these methods the user might enter it by typing 
or spelling. 

[0060] To acquire such information, the speech unit or any other
device issues an information seeking request asking for the
content of a medium or a program to other connected devices, e.g.
all devices of the home network. Such a request is issued, e.g.,
when a new CD is inserted in a player for the first time or when
a device capable of receiving programs is attached to the bus.
There might be more than one device that tries to answer the
request. Possible devices might be for example:

a device capable of reading a delivered database, e.g. a CD-ROM
contained in a disc jukebox, a video tape player telling the
content of the tape, a database that has been entered by the
user, e.g. on his PC, a set top box telling the channels, i.e.
program names, it can receive;

a device connected to another information transport mechanism,
e.g. a WEB-TV, a set-top-box, a DAB receiver, a PC, that at
least sometimes is connected to the internet or has a modem to
connect to a system holding program channel information or other
information;

a device communicating with the user that queries a content,
e.g. by asking him/her what a frequently played song is called,
what program he is currently watching, etc., a dialogue
initiated by the user about a newly bought medium and the wish to
enter the titles by typing, spelling or speaking.

[0061] Fig. 12 shows an example of the interaction of multiple
devices for vocabulary extensions concerning media contents. In
this case the information is delivered neither by the speech unit
nor by the device holding the media. After a new medium is
inserted for the first time in a media player, the media player
sends a notification of insertion of medium X to the speech unit.
The speech unit then sends a content query for medium X in the
form of a control-network-command to the media player and to all
other connected network devices. One of the other connected
network devices thereafter sends the content information for
medium X to the speech unit. The speech unit updates its
vocabulary and sends an acknowledgement of receipt to the media
player and to the other connected network device that has sent
the content information for medium X. The medium content can now
be accessed by spoken-commands, e.g. play Tschaikowsky Piano
Concert b-minor.
[0062] Fig. 13 shows another example of the interaction of
multiple devices for vocabulary extension concerning media
contents. In this case two devices answer the query. The first
answer is chosen to update the vocabulary while the second answer
is discarded.
[0063] After a new medium is inserted for the first time in a
media player, the media player sends a notification of insertion
of medium X in the form of a control-network-command to the
speech unit. The speech unit then sends a content query for
medium X to the media player and all other connected network
devices. In this case the media player sends the content
information for medium X, since the content description is
contained on the medium. The speech unit then updates its
vocabulary and/or grammars and sends an acknowledgement of
receipt in the form of a control-network-command to the media
player. If the content information for medium X is thereafter
delivered by another connected network device, the speech unit
discards this information.
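The rule that the first answer to a content query updates the vocabulary and later answers are discarded can be sketched as follows. The class and attribute names and the device labels are hypothetical, and the sketch acknowledges only the answering device, which is a simplification of the acknowledgements shown in Figs. 12 and 13.

```python
from typing import Dict, List, Tuple


class ContentCollector:
    """Hypothetical helper: keeps the first content answer per medium."""

    def __init__(self) -> None:
        self.vocabulary: Dict[str, List[str]] = {}      # medium ID -> titles
        self.acknowledged: List[Tuple[str, str]] = []   # (sender, medium ID)

    def on_content_info(self, sender: str, medium_id: str, titles: List[str]) -> bool:
        """Return True if the answer was used, False if it was discarded."""
        if medium_id in self.vocabulary:
            return False                                # a first answer already arrived
        self.vocabulary[medium_id] = list(titles)       # update the vocabulary
        self.acknowledged.append((sender, medium_id))   # acknowledge receipt
        return True
```

Choosing the first answer keeps the update deterministic without requiring the speech unit to rank competing information sources.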
[0064] It might also be possible that a database delivered on
some medium, e.g. a CD-ROM, or a database stored on the internet,
i.e. an internet page, or transmitted via digital broadcasting
contains the user-network-commands and corresponding vocabulary
and/or grammars of a remotely controllable network device. In
this case this information can be downloaded by the speech unit 2
like the media descriptions, e.g. when a new device 11 is
connected to the network or when a user initiates a vocabulary
update. Such devices need not carry this information in a memory;
instead it can be delivered with the device 11 on a data carrier
that can be read by another device 11 connected to the network,
or it can be supplied by the manufacturer of the device via the
internet or digital broadcasting.



Claims 

1. Speech unit (2) comprising a speech recognition device,
connected to a microphone (1) for generating
user-network-commands according to electric signals provided by
said microphone (1) to control a remotely controllable device
(11) connected to said speech unit (2), characterized by

a control unit (4) to send control-network-commands to said
device (11) connected to said speech recognition device so that
said device (11) transmits device or medium dependent vocabulary
and/or grammars and corresponding user-network-commands to said
speech recognition device and to receive data and messages from
said device (11); and

a memory (7b, 7d) to store said device or medium dependent
vocabulary and/or grammars and corresponding
user-network-commands transmitted by said remotely controllable
device (11) connected to said speech recognition device.

2. Speech unit (2) according to claim 1, characterized by an
interface (5, 6) connected to a network system (10) to which a
remotely controllable device (11) is connected that is controlled
via said network system (10), to send generated
user-network-commands and control-network-commands via said
network system (10) to said remotely controllable device (11) and
to receive data and messages from said remotely controllable
device (11).

3. Speech unit (2) according to claim 1 or 2, characterized in
that said control unit (4) determines what kind of devices (11)
are connected to said network system (10), to send said
control-network-commands to said devices (11), and to receive
data from said devices (11).

4. Speech unit (2) connected to a microphone (1) for generating
user-network-commands according to electric signals provided by
said microphone (1) to control a remotely controllable device
(11) characterized by an interface (5, 6) connected to a network
system (10) to which said remotely controllable device (11) is
connected that is controlled via said network system (10), to
send generated user-network-commands via said network system (10)
to said remotely controllable device (11) and to receive data and
messages from said remotely controllable device (11) whereby said
user-network-commands are generated from device or medium
dependent vocabulary and/or grammars transmitted by said remotely
controllable device (11).

5. Speech unit (2) according to anyone of claims 1 to 4,
characterized in that said device (11) is wired or wireless
connected to said speech unit (2).

6. Speech unit (2) according to anyone of claims 1 to 5,
characterized by a memory (7a, 7c) to initially store general
vocabulary and grammars based on which general
user-network-commands are generated.

7. Speech unit (2) according to anyone of claims 1 to 6,
characterized by a speaker recognition section (3a) to identify
different users according to said electric signals provided by
said microphone (1) to be able to generate speaker dependent
user-network-commands.

8. Speech unit (2) according to anyone of claims 1 to 7,
characterized by a speech synthesizer (9) to synthesize messages
from said devices (11) and from said speech unit itself and to
output them to a user via a loudspeaker.

9. Speech unit (2) according to anyone of claims 1 to 8,
characterized by a microphone (1) and/or a loudspeaker.

10. Speech unit (2) according to anyone of claims 1 to 9,
characterized in that said microphone (1) and/or a loudspeaker
are/is remotely connected to said speech unit (2) either wired or
wireless either direct or via a network.

11. Speech unit (2) according to anyone of claims 2 to 10,
characterized in that said interface (5, 6) is connected to said
network system (10) via a public telephone network.

12. Speech unit (2) according to anyone of claims 2 to 11,
characterized in that said interface (5, 6) is connected to said
network system (10) via another network system, like a computer
network system.

13. Speech unit (2) according to anyone of claims 2 to 12,
characterized in that said network system (10) is an IEEE 1394
network system.



14. Remotely controllable device (11), comprising:

- a control unit (12) to extract user-network-commands directed
to said device (11) and to control the functionality of said
remotely controllable device (11) according to said extracted
user-network-commands, characterized in that said control unit
(12) also extracts control-network-commands directed to said
remotely controllable device (11) and, according to said
extracted control-network-commands, controls the transmission of
device dependent user-network-commands and corresponding
vocabulary and/or grammars stored in a memory (13) of said
remotely controllable device (11) useable by a speech unit (2)
connected thereto to convert spoken-commands from a user into
user-network-commands to control the functionality of said
remotely controllable device (11).

15. Remotely controllable device (11), comprising:

a control unit (12) to extract user-network-commands directed to
said device (11) and to control the functionality of said
remotely controllable device (11) according to said extracted
user-network-commands, characterized in that said control unit
(12) also extracts control-network-commands directed to said
remotely controllable device (11) and, according to said
extracted control-network-commands, controls the transmission of
medium dependent user-network-commands and corresponding
vocabulary and/or grammars stored on a medium accessable by said
remotely controllable device (11) useable by a speech unit (2)
connected thereto to convert spoken-commands from a user into
user-network-commands to control the functionality of said
remotely controllable device (11) in regard to said accessable
medium or to control the functionality of said or another
remotely controlled device (11).

16. Remotely controllable device (11) according to claim 15,
characterized in that said medium accessable by said remotely
controllable device (11) is a CD-ROM.

17. Remotely controllable device (11) according to claim 15,
characterized in that said medium accessable by said remotely
controllable device (11) is an internet page or information page
transmitted via digital broadcasting.

18. Remotely controllable device (11) according to anyone of
claims 14 to 17, characterized by an interface (16, 17) connected
to a network system (10) to which other devices (11) and said
speech unit (2) are connected to receive and transmit commands,
data and messages.

19. Remotely controllable device (11) according to claim 18,
characterized in that said network system (10) is an IEEE 1394
network system.

20. Method of self-initialisation of a speech unit (2) connected
to a remotely controllable device (11), comprising the following
steps:

a) send a control-network-command to said remotely controllable
device (11) to control said device (11) to transmit device or
medium dependent user-network-commands to control said device
(11) or another device (11) and the corresponding vocabulary
and/or grammars;

b) receive said device or medium dependent user-network-commands
and the corresponding vocabulary and/or grammars from said device
(11);

c) update vocabulary and/or grammars and the corresponding
user-network-commands in a memory (7).

21. Method according to claim 20, characterized by the following
steps:

ask for a logical name or identifier of said device (11) offering
the device dependent user-network-commands and the corresponding
vocabulary and/or grammars;

receive logical name or identifier; and

assign vocabulary and grammars and corresponding
user-network-commands for said device (11) to the received
logical name or identifier when said vocabulary and/or grammars
and the corresponding user-network-commands are updated in said
memory (7) in order to create device (11) dependent
user-network-commands.

22. Method according to claim 21, characterized in that said
logical name of said device (11) is either determined by a user
or by said device (11) itself.

23. Method according to claim 21 or 22, characterized in that
said identifier includes address and name of said device (11).

24. Method according to anyone of claims 20 to 23, characterized
by the following steps:

send a control-network-command to identify a user dependent
mapping for the vocabulary and/or grammars and corresponding
user-network-commands;

receive name/s, identifier/s or speech sample of said user/s said
user dependency should be created for; and

assign the vocabulary and/or grammars and corresponding
user-network-commands for said device (11) to the received
name/s, identifier/s or speech sample of said user/s when said
vocabulary and/or grammars and the corresponding
user-network-commands are updated in said memory (7) in order to
create user dependent user-network-commands.



Patentansprüche

1. Spracheinheit (2), mit einer Spracherkennungsvorrichtung, die
mit einem Mikrophon (1) verbunden ist, um
Benutzer-Netzwerk-Befehle entsprechend elektrischer Signale zu
erzeugen, die von dem Mikrophon (1) bereitgestellt werden, um
eine fernsteuerbare Vorrichtung (11) zu steuern, die mit der
Spracheinheit (2) verbunden ist, gekennzeichnet durch

eine Steuereinheit (4) zum Senden von Steuer-Netzwerk-Befehlen
zu der Vorrichtung (11), die mit der Spracherkennungsvorrichtung
so verbunden ist, dass die Vorrichtung (11) geräte- oder
mediumabhängiges Vokabular und/oder Grammatik sowie
korrespondierende Benutzer-Netzwerk-Befehle zu der
Spracherkennungsvorrichtung überträgt, und zum Empfangen von
Daten und Nachrichten von der Vorrichtung (11); und

einen Speicher (7b, 7d) zum Speichern des geräte- oder
mediumabhängigen Vokabulars und/oder der Grammatik sowie der
korrespondierenden Benutzer-Netzwerk-Befehle, welche von der mit
der Spracherkennungsvorrichtung verbundenen fernsteuerbaren
Vorrichtung (11) übertragen wurden.

2. Spracheinheit (2) nach Anspruch 1, gekennzeichnet durch eine
mit einem Netzwerksystem (10) verbundene Schnittstelle (5, 6),
mit dem eine fernsteuerbare Vorrichtung (11) verbunden ist, die
über das Netzwerksystem (10) gesteuert wird, um erzeugte
Benutzer-Netzwerk-Befehle und Steuer-Netzwerk-Befehle über das
Netzwerksystem (10) zu der fernsteuerbaren Vorrichtung (11) zu
senden sowie Daten und Nachrichten von der fernsteuerbaren
Vorrichtung (11) zu empfangen.

3. Spracheinheit (2) nach Anspruch 1 oder 2, dadurch
gekennzeichnet, dass die Steuereinheit (4) bestimmt, welche Art
von Vorrichtungen (11) mit dem Netzwerksystem (10) verbunden
sind, um Steuer-Netzwerk-Befehle zu den Vorrichtungen (11) zu
senden und Daten von den Vorrichtungen (11) zu empfangen.

4. Spracheinheit (2), die mit einem Mikrophon (1) verbunden ist,
um Benutzer-Netzwerk-Befehle entsprechend elektrischer Signale zu
erzeugen, die von dem Mikrophon (1) bereitgestellt werden, um
eine fernsteuerbare Einrichtung (11) zu steuern, gekennzeichnet
durch eine mit einem Netzwerksystem (10) verbundene Schnittstelle
(5, 6), an welchem die fernsteuerbare über Netzwerksystem (10)
gesteuerte Vorrichtung (11) angeschlossen ist, um erzeugte
Benutzer-Netzwerk-Befehle über das Netzwerksystem (10) an die
fernsteuerbare Vorrichtung (11) zu senden sowie Daten und
Nachrichten von der fernsteuerbaren Vorrichtung (11) zu
empfangen, wobei die Benutzer-Netzwerk-Befehle anhand von geräte-
oder mediumabhängigem Vokabular und/oder Grammatik erzeugt
werden, welche von der fernsteuerbaren Vorrichtung (11)
übertragen wurden.

5. Spracheinheit (2) nach einem der Ansprüche 1 bis 4, dadurch
gekennzeichnet, dass die Vorrichtung (11) verdrahtet oder
drahtlos an die Spracheinheit (2) angeschlossen ist.

6. Spracheinheit (2) nach einem der Ansprüche 1 bis 5,
gekennzeichnet durch einen Speicher (7a, 7c), um anfänglich
allgemeines Vokabular und Grammatik zu speichern, worauf
basierend allgemeine Benutzer-Netzwerk-Befehle erzeugt werden.

7. Spracheinheit (2) nach einem der Ansprüche 1 bis 6,
gekennzeichnet durch einen Sprechererkennungsteil (3a) zur
Identifizierung verschiedener Benutzer entsprechend der über das
Mikrophon (1) bereitgestellten elektrischen Signale, um
sprecherabhängige Benutzer-Netzwerk-Befehle zu erzeugen.

8. Spracheinheit (2) nach einem der Ansprüche 1 bis 7,
gekennzeichnet durch einen Sprachsynthesizer (9) zum
Synthetisieren von Nachrichten von den Vorrichtungen (11) und der
Spracheinheit selbst und zu deren Ausgabe über einen Lautsprecher
an einen Benutzer.

9. Spracheinheit (2) nach einem der Ansprüche 1 bis 8,
gekennzeichnet durch ein Mikrophon (1) und/oder einen
Lautsprecher.

10. Spracheinheit (2) nach einem der Ansprüche 1 bis 9, dadurch
gekennzeichnet, dass das Mikrophon (1) und/oder ein Lautsprecher
abgesetzt verdrahtet oder nicht verdrahtet direkt oder über ein
Netzwerk an die Spracheinheit (2) angeschlossen ist/sind.

11. Spracheinheit (2) nach einem der Ansprüche 2 bis 10, dadurch
gekennzeichnet, dass die Schnittstelle (5, 6) über ein
öffentliches Telefonnetz an das Netzwerksystem (10) angeschlossen
ist.

12. Spracheinheit (2) nach einem der Ansprüche 2 bis 11, dadurch
gekennzeichnet, dass die Schnittstelle (5, 6) über ein anderes
Netzwerksystem, z.B. ein Computernetzwerksystem, an das
Netzwerksystem (10) angeschlossen ist.

13. Spracheinheit (2) nach einem der Ansprüche 2 bis 12, dadurch
gekennzeichnet, dass das Netzwerksystem (10) ein IEEE 1394
Netzwerksystem ist.



14. Fernsteuerbare Vorrichtung (11), mit:

einer Steuereinheit (12) zum Extrahieren von an die Vorrichtung
(11) gerichteten Benutzer-Netzwerk-Befehlen und zum Steuern der
Funktionalität der fernsteuerbaren Vorrichtung (11) gemäß der
extrahierten Benutzer-Netzwerk-Befehle, dadurch gekennzeichnet,
dass die Steuereinheit (12) auch an die fernsteuerbare
Vorrichtung (11) gerichtete Steuer-Netzwerk-Befehle extrahiert
und gemäß der extrahierten Steuer-Netzwerk-Befehle die
Übertragung geräteabhängiger Benutzer-Netzwerk-Befehle sowie
korrespondierendes Vokabular und/oder Grammatik steuert, welche
in einem Speicher (13) der fernsteuerbaren Vorrichtung (11)
gespeichert sind und von einer daran angeschlossenen
Spracheinheit (2) verwendet werden können, um gesprochene Befehle
eines Benutzers in Benutzer-Netzwerk-Befehle zu wandeln und damit
die Funktionalität der fernsteuerbaren Vorrichtung (11) zu
steuern.

15. Fernsteuerbare Vorrichtung (11), mit:

- einer Steuereinheit (12) zum Extrahieren von an die Vorrichtung
(11) gerichteten Benutzer-Netzwerk-Befehlen und zum Steuern der
Funktionalität der fernsteuerbaren Vorrichtung (11) gemäß der
extrahierten Benutzer-Netzwerk-Befehle, dadurch gekennzeichnet,
dass die Steuereinheit (12) auch an die fernsteuerbare
Vorrichtung (11) gerichtete Steuer-Netzwerk-Befehle extrahiert
und gemäß der extrahierten Steuer-Netzwerk-Befehle die
Übertragung mediumabhängiger Benutzer-Netzwerk-Befehle sowie
korrespondierendes Vokabular und/oder Grammatik steuert, welche
auf einem von der fernsteuerbaren Vorrichtung (11) zugreifbaren
Medium gespeichert werden und von einer angeschlossenen
Spracheinheit (2) verwendet werden können, um gesprochene Befehle
eines Benutzers in Benutzer-Netzwerk-Befehle zu wandeln und damit
die Funktionalität der fernsteuerbaren Vorrichtung (11) bezüglich
des zugreifbaren Mediums oder die Funktionalität der oder einer
anderen ferngesteuerten Vorrichtung (11) zu steuern.

16. Remotely controllable device (11) according to claim 15, characterized in that the medium accessible by the remotely controllable device (11) is a CD-ROM.

17. Remotely controllable device (11) according to claim 15, characterized in that the medium accessible by the remotely controllable device (11) is an Internet page or an information page transmitted via digital broadcasting.

18. Remotely controllable device (11) according to any one of claims 14 to 17, characterized by an interface (16, 17) connected to a network system (10), to which other devices (11) and the speech unit (2) are connected, in order to send and receive commands, data and messages.

19. Remotely controllable device (11) according to claim 18, characterized in that the network system (10) is an IEEE 1394 network system.

20. Method of self-initialization of a speech unit (2) connected to a remotely controllable device (11), comprising the following steps:

a) sending a control-network-command to the remotely controllable device (11) to cause the device (11) to send device- or medium-dependent user-network-commands, with which the device (11) or another device (11) is controlled, and the corresponding vocabulary and grammar;

b) receiving device- or medium-dependent user-network-commands and the corresponding vocabulary and/or grammar from the device (11);

c) updating the vocabulary and/or grammar and the corresponding user-network-commands in a memory (7).
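Steps a) to c) of claim 20 can be sketched as follows. This is a minimal illustration under stated assumptions: the `SpeechUnit` class, the `send` callable that stands in for the network, the `SEND_SPEECH_DATA` command name and the reply layout are hypothetical and are not defined by the claim.

```python
class SpeechUnit:
    """Hypothetical speech unit (2) holding memory (7)."""

    def __init__(self):
        # Memory (7): phrase -> user-network-command, plus grammar rules.
        self.memory = {"vocabulary": {}, "grammar": []}

    def self_initialize(self, device_address, send):
        # a) send a control-network-command asking the device for its speech data
        reply = send({"to": device_address, "kind": "control",
                      "command": "SEND_SPEECH_DATA"})
        # b) receive the user-network-commands, vocabulary and/or grammar
        if reply is None:
            return False
        # c) update memory (7) with what was received
        self.memory["vocabulary"].update(reply.get("vocabulary", {}))
        for rule in reply.get("grammar", []):
            if rule not in self.memory["grammar"]:
                self.memory["grammar"].append(rule)
        return True


def fake_device(message):
    """Stand-in for a device (11); the returned data is illustrative only."""
    if message["command"] == "SEND_SPEECH_DATA":
        return {"vocabulary": {"play": "VCR_PLAY"}, "grammar": ["<command>"]}
    return None


unit = SpeechUnit()
initialized = unit.self_initialize("vcr-1", fake_device)
```

Running the initialization twice leaves the memory unchanged, because the update step merges rather than appends.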

21. Method according to claim 20, characterized by the following steps:

requesting a logical name or an identifier of the device (11) that offered the device-dependent user-network-commands and the corresponding vocabulary and/or grammar;

receiving the logical name or identifier; and

assigning the vocabulary, the grammar and the corresponding user-network-commands for the device (11) to the received logical name or identifier after the vocabulary and/or grammar and the corresponding user-network-commands have been updated in the memory (7), in order to generate device-dependent user-network-commands for the device (11).

22. Method according to claim 21, characterized in that the logical name of the device (11) is determined either by a user or by the device (11) itself.

23. Method according to claim 21 or 22, characterized in that the identifier comprises the address and the name of the device (11).
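The assignment described in claims 21 to 23 can be pictured as keying the received vocabulary by the device's name, so that the same phrase can address different devices. The sketch below is an assumption-laden illustration: `Identifier`, `assign_to_device`, `resolve` and the sample names are hypothetical.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    """Identifier as in claim 23: address plus name of the device (11)."""
    address: str
    name: str


def assign_to_device(memory, identifier, vocabulary):
    """Bind each phrase to (device address, command) under the device's name."""
    entries = memory.setdefault(identifier.name, {})
    for phrase, command in vocabulary.items():
        entries[phrase] = (identifier.address, command)
    return memory


def resolve(memory, device_name, phrase):
    """Look up the device-dependent user-network-command for a spoken phrase."""
    return memory.get(device_name, {}).get(phrase)


memory = {}
assign_to_device(memory, Identifier("node-3", "VCR"), {"play": "PLAY"})
assign_to_device(memory, Identifier("node-5", "CD"), {"play": "PLAY"})
target = resolve(memory, "VCR", "play")
```

Here "play" resolves to a different network address depending on whether the user names the VCR or the CD player.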

24. Method according to any one of claims 20 to 23, characterized by the following steps:

sending a control-network-command to identify a user-dependent mapping of the vocabulary and/or grammar to corresponding user-network-commands;

receiving the name(s), identifier(s) or a speech sample of the user(s) for whom the user dependency is to be created; and

assigning the vocabulary and/or grammar and the corresponding user-network-commands for the device (11) to the received name(s), identifier(s) or speech sample of the user(s) after the vocabulary and/or grammar and the corresponding user-network-commands have been updated in the memory (7), in order to generate user-dependent user-network-commands.
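A user-dependent mapping of the kind described in claim 24 might look like the following sketch. The function names, the user names and the fallback to a general vocabulary are assumptions for illustration; the claim only requires that vocabulary, grammar and commands are assigned to the received user names, identifiers or speech samples.

```python
def assign_to_users(memory, users, vocabulary):
    """Store phrase -> user-network-command under each received user name."""
    for user in users:
        memory.setdefault(user, {}).update(vocabulary)
    return memory


def command_for(memory, user, phrase, general=None):
    """Prefer the user's own mapping, then fall back to a general vocabulary."""
    personal = memory.get(user, {})
    if phrase in personal:
        return personal[phrase]
    return (general or {}).get(phrase)


user_memory = assign_to_users({}, ["alice"], {"my channel": "TV_CHANNEL_5"})
general_vocabulary = {"play": "VCR_PLAY"}
alice_command = command_for(user_memory, "alice", "my channel", general_vocabulary)
bob_command = command_for(user_memory, "bob", "my channel", general_vocabulary)
```

In this illustration only the user for whom the dependency was created can trigger the personalized command; other speakers still reach the general commands.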



Claims

1. Speech unit (2), comprising a speech recognition device connected to a microphone (1) in order to generate user-network-commands according to electrical signals provided by said microphone (1) for controlling a remotely controllable device (11) connected to said speech unit (2), characterized by:

a control unit (4) for sending control-network-commands to said device (11) connected to said speech recognition device so that said device (11) transmits device- or medium-dependent vocabulary and/or grammar rules as well as corresponding user-network-commands to said speech recognition device, and for receiving data and messages from said device (11); and

a memory (7b, 7d) for storing said device- or medium-dependent vocabulary and/or grammar rules and said corresponding user-network-commands transmitted by said remotely controllable device (11) connected to said speech recognition device.

2. Speech unit (2) according to claim 1, characterized by an interface (5, 6) connected to a network system (10) to which a remotely controllable device (11), which is controlled via said network system (10), is connected, in order to send generated user-network-commands and control-network-commands via said network system (10) to said remotely controllable device (11) and to receive data and messages from said remotely controllable device (11).

3. Speech unit (2) according to claim 1 or 2, characterized in that said control unit (4) determines which type of devices (11) are connected to said network system (10) in order to send said control-network-commands to said devices (11) and to receive data from said devices (11).

4. Speech unit (2) connected to a microphone (1) in order to generate user-network-commands according to electrical signals provided by said microphone (1) for controlling a remotely controllable device (11), characterized by an interface (5, 6) connected to a network system (10) to which said remotely controllable device (11), which is controlled via said network system (10), is connected, for sending generated user-network-commands via said network system (10) to said remotely controllable device (11) and for receiving data and messages from said remotely controllable device (11), so that said user-network-commands are generated on the basis of device- or medium-dependent vocabulary and/or grammar rules transmitted by said remotely controllable device (11).

5. Speech unit (2) according to any one of claims 1 to 4, characterized in that said device is connected to said speech unit (2) by wire or wirelessly.

6. Speech unit (2) according to any one of claims 1 to 5, characterized by a memory (7a, 7c) for initially storing general vocabulary and grammar rules on the basis of which general user-network-commands are generated.

7. Speech unit (2) according to any one of claims 1 to 6, characterized by a speaker recognition section (3a) for identifying different users according to said electrical signals provided by said microphone (1), so that speaker-dependent user-network-commands can be generated.

8. Speech unit (2) according to any one of claims 1 to 7, characterized by a speech synthesizer (9) for synthesizing messages coming from said devices (11) and from said speech unit itself and for outputting them to a user via a loudspeaker.

9. Speech unit (2) according to any one of claims 1 to 8, characterized by a microphone (1) and/or a loudspeaker.

10. Speech unit (2) according to any one of claims 1 to 9, characterized in that said microphone (1) and/or a loudspeaker are remotely connected to said speech unit (2) by wire or wirelessly, directly or via a network.

11. Speech unit (2) according to any one of claims 2 to 10, characterized in that said interface (5, 6) is connected to said network system (10) via a public telephone network.

12. Speech unit (2) according to any one of claims 2 to 11, characterized in that said interface (5, 6) is connected to said network system (10) via another network system, such as a computer network system.

13. Speech unit (2) according to any one of claims 2 to 12, characterized in that said network system (10) is a network system of the IEEE 1394 type.

14. Remotely controllable device (11), comprising:

a control unit (12) for extracting user-network-commands addressed to said device (11) and for controlling the functionality of said remotely controllable device (11) according to said extracted user-network-commands,

characterized in that said control unit (12) also extracts control-network-commands addressed to said remotely controllable device (11) and, according to said extracted control-network-commands, controls the transmission of device-dependent user-network-commands and of corresponding vocabulary and/or grammar rules stored in a memory (13) of said remotely controllable device (11), usable by a speech unit (2) connected to it to convert spoken commands from a user into user-network-commands for controlling the functionality of said remotely controllable device (11).

15. Remotely controllable device (11), comprising:

a control unit (12) for extracting user-network-commands addressed to said device (11) and for controlling the functionality of said remotely controllable device (11) according to said extracted user-network-commands,

characterized in that said control unit (12) also extracts control-network-commands addressed to said remotely controllable device (11) and, according to said extracted control-network-commands, controls the transmission of medium-dependent user-network-commands and of corresponding vocabulary and/or grammar rules stored on a medium accessible to said remotely controllable device (11), usable by a speech unit (2) connected to it to convert spoken commands from a user into user-network-commands for controlling the functionality of said remotely controllable device (11) with regard to said accessible medium, or for controlling the functionality of said or another remotely controlled device (11).

16. Remotely controllable device (11) according to claim 15, characterized in that said medium accessible to said remotely controllable device (11) is a CD-ROM.

17. Remotely controllable device (11) according to claim 15, characterized in that said medium accessible to said remotely controllable device (11) is an Internet page or an information page transmitted via digital broadcasting.

18. Remotely controllable device (11) according to any one of claims 14 to 17, characterized by an interface (16, 17) connected to a network system (10) to which other devices (11) and said speech unit (2) are connected, in order to receive and transmit commands, data and messages.

19. Remotely controllable device (11) according to claim 18, characterized in that said network system (10) is a network system of the IEEE 1394 type.

20. Method of self-initialization of a speech unit (2) connected to a remotely controllable device (11), comprising the following steps:

a) sending a control-network-command to said remotely controllable device (11) in order to instruct said device (11) to transmit device- or medium-dependent user-network-commands for controlling said device (11) or another device (11), as well as the corresponding vocabulary and/or grammar rules;

b) receiving said device- or medium-dependent user-network-commands and the corresponding vocabulary and/or grammar rules from said device (11);

c) updating the vocabulary and/or grammar rules and the corresponding user-network-commands in a memory (7).

21. Method according to claim 20, characterized by the following steps:

requesting a logical name or an identifier from said device (11) offering the device-dependent user-network-commands and the corresponding vocabulary and/or grammar rules;

receiving the logical name or identifier; and

assigning the vocabulary and grammar rules as well as the corresponding user-network-commands relating to said device (11) to the received logical name or identifier once said vocabulary and/or grammar rules and the corresponding user-network-commands have been updated in said memory (7), in order to create user-network-commands dependent on a device (11).

22. Method according to claim 21, characterized in that said logical name of said device (11) is determined by a user or by said device (11) itself.

23. Method according to claim 21 or 22, characterized in that said identifier comprises the address and the name of said device (11).

24. Method according to any one of claims 20 to 23, characterized by the following steps:

sending a control-network-command in order to identify a user-dependent mapping for the vocabulary and/or grammar rules and the corresponding user-network-commands;

receiving the name or names, the identifier or identifiers, or a speech sample of the user or users for whom the user dependency is to be created; and

assigning the vocabulary and/or grammar rules as well as the corresponding user-network-commands relating to said device (11) to the name or names, the identifier or identifiers, or the speech sample of said user or users once said vocabulary and/or grammar rules as well as said corresponding user-network-commands have been updated in said memory (7), in order to create user-dependent user-network-commands.
Figure 3:
Figure 5:
Figure 8:



DIALOG GRAMMAR

S: Channel? -> U_CHANNEL -> S: Today? -> U_Y/N/DATE -> S: Which film? -> U_FILM
S: Date? -> U_DATE

GRAMMAR FOR WORD SEQUENCE

U_Y/N/DATE (reference signs S42, S45, S46 appear in the drawing; the diagram itself is not recoverable)
U_DATE (diagram not recoverable)

VOCABULARIES

U_YES: yes, yeh
U_NO: no
NO_1_31: one, two, ..., thirty-one, first, second, ..., thirty-first
NO_1_12: one, ..., twelve, first, ..., twelfth
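The vocabularies listed for Figure 8 map naturally onto a lookup table. The sketch below uses only the two vocabularies printed in full in the figure (U_YES and U_NO); the number vocabularies are left out because the figure elides most of their entries. The `classify` helper is a hypothetical name introduced for illustration.

```python
# Vocabularies copied from Figure 8; only the fully listed ones are included.
VOCABULARIES = {
    "U_YES": ["yes", "yeh"],
    "U_NO": ["no"],
}


def classify(word):
    """Return the name of the vocabulary containing the word, or None."""
    normalized = word.strip().lower()
    for name, words in VOCABULARIES.items():
        if normalized in words:
            return name
    return None


answer = classify("Yeh")
```

A recognizer working from such a table would accept either "yes" or "yeh" as an affirmative answer to a system prompt such as "Today?".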
Figure 10: